# 4-bit quantization
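All of the models listed below store their weights as 4-bit integer codes plus per-group metadata. As a rough intuition for what that means, here is a minimal pure-Python sketch of group-wise 4-bit affine quantization; the MLX, AWQ, DWQ, and bitsandbytes NF4 schemes named in the listing all refine this basic idea (real implementations pack two 4-bit codes per byte, operate on tensors, and NF4 uses non-uniform levels, so treat this only as an illustration):

```python
# Illustrative sketch of group-wise 4-bit affine quantization (assumption:
# uniform levels with a per-group scale and minimum; not any library's API).

def quantize_group(weights, levels=16):
    """Map a group of floats onto 4-bit codes in [0, 15] plus (scale, min)."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / (levels - 1) or 1.0  # guard constant groups
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min

def dequantize_group(codes, scale, w_min):
    """Reconstruct approximate floats from the 4-bit codes."""
    return [c * scale + w_min for c in codes]

if __name__ == "__main__":
    group = [-0.31, 0.07, 0.52, -0.88, 0.19, 0.44, -0.05, 0.73]
    codes, scale, zero = quantize_group(group)
    recon = dequantize_group(codes, scale, zero)
    max_err = max(abs(a - b) for a, b in zip(group, recon))
    # Rounding error is bounded by half a quantization step.
    assert max_err <= scale / 2 + 1e-9
    print(f"codes={codes}, max reconstruction error={max_err:.4f}")
```

Each group keeps only one scale and one minimum in full precision, which is why smaller group sizes trade a little extra memory for lower reconstruction error.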
A listing of community-published 4-bit quantized models, with publisher, license, tags, and download/like counts:

| Model | Publisher | License | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| DiffuCoder-7B-cpGRPO-4bit | mlx-community | | Large Language Model, Other | 218 | 1 | 4-bit quantized version of Apple's DiffuCoder-7B-cpGRPO, converted for the MLX framework. |
| Hunyuan-A13B-Instruct-4bit | mlx-community | Other | Large Language Model | 201 | 4 | 4-bit quantized version of Tencent's Hunyuan-A13B model, suited to instruction-following tasks. |
| Llama3.1-Turkish-ChatBot | metehanayhan | MIT | Large Language Model, Other | 176 | 2 | Turkish educational Q&A chatbot fine-tuned from Meta's LLaMA 3.1 8B, optimized for Turkish-language teaching scenarios. |
| Qwen3-235B-A22B-4bit-DWQ-053125 | mlx-community | Apache-2.0 | Large Language Model | 200 | 1 | 4-bit quantized version converted from Qwen3-235B-A22B-8bit for the MLX framework, suited to text generation. |
| Qwen3-30B-A3B-abliterated-fp4 | huihui-ai | Apache-2.0 | Large Language Model, Transformers | 103 | 1 | 4-bit quantized version of Qwen3-30B-A3B-abliterated, with an effective size comparable to an 8B model, suited to text generation. |
| DeepSeek-R1-0528-Qwen3-8B-4bit | mlx-community | MIT | Large Language Model | 924 | 1 | 4-bit quantized version converted from DeepSeek-R1-0528-Qwen3-8B for the MLX framework, suited to text generation. |
| DeepSeek-R1-0528-Qwen3-8B-MLX-4bit | lmstudio-community | MIT | Large Language Model | 274.40k | 1 | Large language model developed by DeepSeek AI, with 4-bit quantization for Apple silicon devices. |
| DeepSeek-R1-0528-4bit | mlx-community | | Large Language Model | 157 | 9 | 4-bit quantized model converted from DeepSeek-R1-0528 for the MLX framework. |
| Devstral-Small-2505-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model, Multilingual | 238 | 3 | 4-bit quantized language model in MLX format, suited to text generation. |
| MedGemma-4B-IT-4bit | mlx-community | Other | Image-to-Text, Transformers | 196 | 1 | Vision-language model for the medical field, handling images and text, suited to tasks such as medical image analysis. |
| Qwen3-235B-A22B-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model | 70 | 1 | 4-bit quantized version converted from Qwen3-235B-A22B-8bit, suited to text generation. |
| Qwen3-4B-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model | 517 | 2 | 4-bit DWQ quantized version of Qwen3-4B, converted to MLX format for text generation with the mlx library. |
| Qwen3-30B-A3B-4bit-DWQ-05082025 | mlx-community | Apache-2.0 | Large Language Model | 240 | 5 | 4-bit quantized model converted from Qwen/Qwen3-30B-A3B to MLX format, suited to text generation. |
| Qwen3-30B-A3B-4bit-DWQ-0508 | mlx-community | Apache-2.0 | Large Language Model | 410 | 12 | 4-bit quantized model converted from Qwen/Qwen3-30B-A3B to MLX format, suited to text generation. |
| Qwen3-30B-A3B-MNN | taobao-mnn | Apache-2.0 | Large Language Model, English | 550 | 1 | MNN model exported from Qwen3-30B-A3B, with 4-bit quantization for efficient inference. |
| Qwen3-8B-4bit-DWQ | mlx-community | Apache-2.0 | Large Language Model | 306 | 1 | 4-bit quantized version of Qwen/Qwen3-8B converted to MLX format, optimized for efficient operation on Apple devices. |
| Phi-4-mini-reasoning-MLX-4bit | lmstudio-community | MIT | Large Language Model | 72.19k | 2 | 4-bit quantized MLX-format version converted from Microsoft's Phi-4-mini-reasoning, suited to text generation. |
| Josiefied-Qwen3-1.7B-abliterated-v1-4bit | mlx-community | | Large Language Model | 135 | 2 | Lightweight 4-bit quantized model based on Qwen3-1.7B, optimized for the MLX framework. |
| Qwen3-8B-4bit-AWQ | mlx-community | Apache-2.0 | Large Language Model | 1,682 | 1 | 4-bit AWQ quantized version converted from Qwen/Qwen3-8B, suited to text generation under the MLX framework. |
| Qwen3-235B-A22B-4bit | mlx-community | Apache-2.0 | Large Language Model | 974 | 6 | 4-bit quantized version of Qwen/Qwen3-235B-A22B converted to MLX format, suited to text generation. |
| Qwen3-8B-4bit | mlx-community | Apache-2.0 | Large Language Model | 2,131 | 2 | 4-bit quantized version of Qwen/Qwen3-8B, converted to MLX format for efficient inference on Apple silicon. |
| Qwen3-30B-A3B-4bit | mlx-community | Apache-2.0 | Large Language Model | 2,394 | 7 | 4-bit quantized version converted from Qwen/Qwen3-30B-A3B, suited to efficient text generation under the MLX framework. |
| Qwen3-4B-4bit | mlx-community | Apache-2.0 | Large Language Model | 7,400 | 6 | 4-bit quantized version converted from Qwen/Qwen3-4B to MLX format, designed for efficient operation on Apple silicon. |
| Qwen3-1.7B-4bit | mlx-community | Apache-2.0 | Large Language Model | 11.85k | 2 | 4-bit quantized version of Qwen3-1.7B, converted to MLX format for efficient operation on Apple silicon. |
| Qwen3-14B-MLX-4bit | lmstudio-community | Apache-2.0 | Large Language Model | 3,178 | 4 | 4-bit quantized version of Qwen/Qwen3-14B converted with mlx-lm, suited to text generation. |
| Qwen3-4B-MNN | taobao-mnn | Apache-2.0 | Large Language Model, English | 10.60k | 2 | 4-bit quantized MNN version of Qwen3-4B for efficient text generation. |
| InternVL2_5-1B-MNN | taobao-mnn | Apache-2.0 | Large Language Model, English | 2,718 | 1 | 4-bit quantized version based on InternVL2_5-1B, suited to text generation and chat. |
| GLM-Z1-32B-0414-4bit | mlx-community | MIT | Large Language Model, Multilingual | 225 | 2 | 4-bit quantized version converted from THUDM/GLM-Z1-32B-0414, suited to text generation. |
| Dia-1.6B-4bit | mlx-community | Apache-2.0 | Speech Synthesis, English | 168 | 4 | 4-bit quantized MLX-format text-to-speech model converted from nari-labs/Dia-1.6B. |
| HiDream-I1-Full-nf4 | azaneko | MIT | Image Generation | 16.95k | 38 | Open-source 17-billion-parameter image generation foundation model producing high-quality images in seconds. |
| HiDream-I1-Fast-nf4 | azaneko | MIT | Image Generation | 19.22k | 7 | 4-bit quantized version of the 17B HiDream-I1 foundation model; runs in 16 GB of VRAM for fast, high-quality image generation. |
| HiDream-I1-Dev-nf4 | azaneko | MIT | Image Generation | 23.29k | 12 | Open-source 17-billion-parameter image generation foundation model producing high-quality images in seconds. |
| Zhaav-Gemma3-4B | alifzl | | Large Language Model, Other | 40 | 1 | Persian-specific model fine-tuned on the Gemma 3 architecture with QLoRA 4-bit quantization; runs on ordinary hardware. |
| QwQ-32B-NF4 | ginipick | Apache-2.0 | Large Language Model, Transformers, English | 150 | 27 | 4-bit quantized version of Qwen/QwQ-32B built with the bitsandbytes library, suited to text generation in resource-constrained environments. |
| QwQ-32B-bnb-4bit | fantos | Apache-2.0 | Large Language Model, Transformers, English | 115 | 4 | 4-bit quantized version of Qwen/QwQ-32B implemented with the bitsandbytes library, suited to text generation in resource-constrained environments. |
| gemma-3-4b-persian-v0-GGUF | mradermacher | Apache-2.0 | Large Language Model, Transformers, Other | 162 | 2 | Statically quantized version of mshojaei77/gemma-3-4b-persian-v0, optimized for Persian text generation. |
| gemma-3-27b-it-quantized-W4A16 | abhishekchohan | | Large Language Model, Transformers | 640 | 4 | W4A16 quantized version of Google's instruction-tuned Gemma 3 27B, substantially reducing hardware requirements. |
| gemma-3-4b-persian-v0 | mshojaei77 | Apache-2.0 | Large Language Model, Other | 542 | 9 | Persian-specific model based on the Gemma 3 architecture, quantized to 4 bits with QLoRA, focused on Persian text generation and understanding. |
| OLMo-2-0325-32B-Instruct-4bit | mlx-community | Apache-2.0 | Large Language Model, Transformers, English | 270 | 10 | 4-bit quantized version converted from allenai/OLMo-2-0325-32B-Instruct for the MLX framework, suited to text generation. |
| QwQ-32B-bnb-4bit | onekq-ai | Apache-2.0 | Large Language Model, Transformers | 167 | 2 | 4-bit quantized version of QwQ-32B built with bitsandbytes, suited to efficient inference in resource-constrained environments. |
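A rough rule of thumb behind claims like "runs in 16 GB of VRAM" and "reduced hardware requirements" above: quantized weight storage is roughly parameters × bits ÷ 8, plus a small overhead for per-group scales and zero points. The sketch below uses assumed values (group size 32, fp16 metadata) purely for illustration:

```python
def weight_memory_gb(n_params, bits, group_size=32, scale_bytes=2):
    """Approximate weight storage for a group-quantized model, in GiB.

    Assumption for illustration: one scale and one zero point
    (scale_bytes each) per group of group_size weights.
    """
    data = n_params * bits / 8
    metadata = (n_params / group_size) * 2 * scale_bytes
    return (data + metadata) / 1024**3

if __name__ == "__main__":
    # HiDream-I1 has 17B parameters: compare fp16 with 4-bit storage.
    fp16 = 17e9 * 2 / 1024**3  # 2 bytes per weight, no quantization metadata
    q4 = weight_memory_gb(17e9, 4)
    print(f"fp16 ~ {fp16:.1f} GiB, 4-bit ~ {q4:.1f} GiB")
```

For the 17B HiDream-I1 this works out to roughly 10 GiB of weights at 4 bits versus about 32 GiB at fp16, consistent with the 16 GB VRAM figure quoted for the NF4 releases (activations and framework overhead take the rest).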